buckets: document CopyFile operation for storage buckets #2375
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
649ff55 to 1e0ec73
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Buckets serve as staging areas for data processing workflows. Process raw data, write intermediate outputs to a bucket, then promote the final artifact to a versioned [Dataset](./datasets) repository when the pipeline completes. This keeps your versioned repo clean while giving your pipeline fast mutable storage.
Note that transferring data from a Bucket to a repository without reuploading is not yet available, but is on the roadmap.
(reminder to please review your agentic PRs before setting them as ready to review)
that comment is still valid, @mishig25
handled in #2375 (comment)
> (reminder to please review your agentic PRs before setting them as ready to review)

I did review all the PRs I submitted: this PR, #2376, and #2377. As proof, you will see the force-pushes and/or follow-up commits I made after the initial PR, and the hand-edited PR descriptions in all three. I only marked them as ready after my review, so that the implementers of the features could catch inaccuracies, such as the one above, that I missed in my own review and understanding of the feature.
davanstrien left a comment
few small nit suggestions
Co-authored-by: Daniel van Strien <davanstrien@users.noreply.github.com>
Summary
`hf buckets cp`) and Python (`api.copy_files`) usage
🤖 Generated with Claude Code
Note
Low Risk
Low risk documentation-only change that adds guidance for `hf buckets cp`/`HfApi.copy_files`; no runtime or API behavior is modified.
Overview
Adds a new “Copying files between repos and buckets” section to the storage buckets docs, describing server-side copying of Xet-tracked content from Hub repos or other buckets into a destination bucket.
Includes CLI (`hf buckets cp`) and Python (`HfApi.copy_files`) examples, plus a note that only Xet-tracked files copy server-to-server (non-Xet files are downloaded and re-uploaded) and that source read + destination write access is required.
Reviewed by Cursor Bugbot for commit 0a40880. Bugbot is set up for automated code reviews on this repo.
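For reviewers who want a mental model of the copy behavior described in the overview, here is a minimal sketch of the decision it implies: Xet-tracked files are copied server-to-server, non-Xet files are downloaded and re-uploaded, and the caller needs read access on the source plus write access on the destination. This is an illustration only. `SourceFile`, `plan_copy`, and their fields are hypothetical names, not part of `huggingface_hub` or the `hf` CLI, and the real implementation may differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SourceFile:
    """Hypothetical description of one file in the source repo or bucket."""

    path: str
    xet_tracked: bool


def plan_copy(
    files: Iterable[SourceFile],
    can_read_source: bool,
    can_write_destination: bool,
) -> Dict[str, List[str]]:
    """Split files by how they would be transferred to the destination bucket.

    Assumption (taken from the PR overview): both source read access and
    destination write access are required before anything is copied.
    """
    if not (can_read_source and can_write_destination):
        raise PermissionError(
            "copy requires read access on the source and write access on the destination"
        )

    plan: Dict[str, List[str]] = {"server_side": [], "download_and_reupload": []}
    for f in files:
        # Xet-tracked content can be copied without passing through the client.
        key = "server_side" if f.xet_tracked else "download_and_reupload"
        plan[key].append(f.path)
    return plan


if __name__ == "__main__":
    example = [
        SourceFile("data/train.parquet", xet_tracked=True),
        SourceFile("README.md", xet_tracked=False),
    ]
    print(plan_copy(example, can_read_source=True, can_write_destination=True))
```

The sketch keeps the permission check ahead of the per-file split, so a caller without the required access gets a single error rather than a partial plan.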